现代汉语语义词典多义词词库的校正和再修订 (New Editing and Checking Work of the Semantic Knowledge Base of Contemporary Chinese (SKCC)) [In Chinese]
Abstract
This paper starts from two principles that sense discrimination for Chinese language processing should follow: completeness and discreteness. Building on a comparison of the Semantic Knowledge-base of Contemporary Chinese (SKCC) with the Grammatical Knowledge-base of Contemporary Chinese (GKB), and supported by a large-scale corpus, we carried out a new round of editing and checking. First, we designed a novel algorithm that extracts candidate multi-sense words by comparing the SKCC and GKB lexicons. For all 1605 candidate multi-sense words, we edited the senses, the explanations, and their translations. Second, we built a tree structure to handle a special lexicon of food and plant words. Third, we built a mapping platform between SKCC and GKB to help establish mapping relationships between the multi-sense words of the two resources. Finally, we completed the mapping for all multi-sense words in SKCC.
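The abstract does not spell out how candidates are selected, so the following is only a minimal sketch of what a lexicon-comparison step could look like. It assumes each lexicon is a list of `(word, part_of_speech, sense)` tuples and flags a word when at least one lexicon lists more than one sense and the two sense counts disagree. The function names, the tuple layout, the toy entries, and the selection rule are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def sense_counts(entries):
    """Count how many senses each (word, pos) pair has in one lexicon."""
    counts = defaultdict(int)
    for word, pos, _sense in entries:
        counts[(word, pos)] += 1
    return dict(counts)


def candidate_multisense(skcc_entries, gkb_entries):
    """Return (word, pos, skcc_count, gkb_count) for words worth reviewing.

    Assumed rule (not from the paper): a word is a candidate when either
    lexicon records more than one sense and the two counts differ.
    """
    skcc = sense_counts(skcc_entries)
    gkb = sense_counts(gkb_entries)
    candidates = []
    for word, pos in sorted(set(skcc) | set(gkb)):
        s = skcc.get((word, pos), 0)
        g = gkb.get((word, pos), 0)
        if max(s, g) > 1 and s != g:
            candidates.append((word, pos, s, g))
    return candidates


# Toy data for illustration only; these are not real SKCC or GKB entries.
skcc_toy = [("打", "v", "strike"), ("打", "v", "play"), ("花", "n", "flower")]
gkb_toy = [("打", "v", "1"), ("打", "v", "2"), ("打", "v", "3"), ("花", "n", "1")]

print(candidate_multisense(skcc_toy, gkb_toy))
```

In this toy run only 打 is flagged, because the two lexicons disagree on its number of senses, while 花 has a single sense in both. A real pipeline would presumably also consult corpus evidence, as the abstract mentions.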
Similar resources
Tibetan Base Noun Phrase Identification Framework Based on Chinese-Tibetan Sentence Aligned Corpus
This paper presents an identification framework for extracting Tibetan base noun phrase (NP). The framework includes two phases. In the first phase, Chinese base NPs are extracted from all Chinese sentences in the sentence aligned Chinese-Tibetan corpus using Stanford Chinese parser. In the second phase, the Tibetan translations of those Chinese NPs are identified using four different methods, ...
[Pondering the standardization of basic terms in traditional Chinese medicine].
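In recent years, translated books and articles on traditional Chinese medicine (TCM) and Chinese herbal medicine have appeared in great numbers and have attracted considerable attention from medical communities around the world. In the practice of TCM translation, the translation of TCM terms is the core and the first priority. That is, to exchange information with the medical community abroad and to make translations readily understandable to foreign readers, we must first translate TCM terms accurately, and thereby achieve accurate and complete translation of monographs and articles on TCM and Chinese herbal medicine. The standardization of TCM terminology is an important piece of basic systems engineering for TCM. It is of great and far-reaching significance for the modernization and internationalization of TCM, the dissemination of TCM knowledge, medical exchange at home and abroad, communication between disciplines and industries, the promotion and use of TCM scientific and technological achievements and the development of production technology, the editing and publishing of TCM books, journals, and textbooks, and especially the development and application of modern information technology. It promotes the compilation of electronic TCM dictionaries and the development of commercially valuable practical computer systems such as expert-database systems, knowledge-base systems, and machine translation systems, and will yield substantial social and economic benefits. ...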
Reading news for information: How much vocabulary a CFL learner should know
This paper reports the findings of a corpus-based study on the vocabulary used in journalistic Chinese. Based on a 20-million character corpus of more than 27,000 news texts collected between mid-2003 and the end of 2004 from various Chinese media sources in different countries and regions, a character frequency list and three word and phrase frequency lists with two, three and four characters w...
A Unified Framework for Discourse Argument Identification via Shallow Semantic Parsing
This paper deals with Discourse Argument Identification (DAI) from both intra-sentence and inter-sentence perspectives. For intra-sentence cases, we approach it via a simplified shallow semantic parsing framework, which recasts the discourse connective as the predicate and its scope into several constituents as the argument of the predicate. Different from state-of-the-art chunking approaches, ...
Bilingual Lexicon Construction from Comparable Corpora via Dependency Mapping
Bilingual lexicon construction (BLC) from comparable corpora is based on the idea that bilingual similar words tend to occur in similar contexts, usually of words. This, however, introduces noise and leads to low performance. This paper proposes a bilingual dependency mapping model for BLC which encodes a word’s context as a combination of its dependent words and their relationships. This combi...
Publication date: 2015